Abstract

A distributed estimation scheme where the sensors transmit with constant modulus signals over 
a multiple access channel is considered. The proposed estimator is shown to be strongly consistent 
for any sensing noise distribution in the i.i.d. case both for a per-sensor power constraint, and a total 
power constraint. When the distributions of the sensing noise are not identical, a bound on the variances 
is shown to establish strong consistency. The estimator is shown to be asymptotically normal with a 
variance (AsV) that depends on the characteristic function of the sensing noise. Optimization of the 
AsV is considered with respect to a transmission phase parameter for a variety of noise distributions 
exhibiting differing levels of impulsive behavior. The robustness of the estimator to impulsive sensing
noise distributions such as those with positive excess kurtosis, or those that do not have finite moments 
is shown. The proposed estimator is favorably compared with the amplify and forward scheme under 
an impulsive noise scenario. The effect of fading is shown to not affect the consistency of the estimator, 
but to scale the asymptotic variance by a constant fading penalty depending on the fading statistics. 
Simulations corroborate our analytical results. 
I. Introduction

In inference-based wireless sensor networks, low-power sensors with limited battery and peak- 
power capabilities transmit their observations to a fusion center (FC) for detection of events or 
estimation of parameters. For distributed estimation, much of the literature has focused on a set 
of orthogonal (parallel) fading channels between the sensors and the FC (please see [1] and the 
references therein). The bandwidth requirements of such an orthogonal WSN scale linearly with
the number of sensors. In contrast, over multiple access channels where the sensor transmissions 
are simultaneous and in the same frequency band, the utilized bandwidth does not depend on 
the number of sensors. In both cases, sensors may adopt either a digital or analog method for 
relaying the sensed information to the FC. The digital method consists of quantizing the sensed 
data and transmitting with digital modulation over a rate-constrained channel. In these cases, 
the required channel bandwidth is proportional to the number of bits at the output of the quantizer
which are transmitted after pulse shaping and digital modulation. The analog method consists of 
transmitting unquantized data by appropriately pulse shaping and amplitude or phase modulating 
to consume finite bandwidth. 

The literature on distributed estimation over multiple access channels has mainly involved 
analog sensor transmission schemes where the instantaneous transmit power is influenced by 
the sensor measurement noise and is not bounded [2]-[8]. In [2], distributed estimation over 
Gaussian multiple access channels is studied from a joint source-channel coding point of view. 
Reference [3] considers optimization of the sensor gains in the presence of channel fading. In [4] 
and [5], the effects of different fading distributions and channel feedback on the performance 
of distributed estimators over multiple access channels are studied. A direct-sequence CDMA
scheme with amplify and forward (AF) is considered in [6], where the asymptotic MSE is studied. In
[7], the authors introduce a type-based multiple access scheme where more than one orthogonal 
channel is utilized albeit less in number than the number of sensors. In [8], a likelihood-based 
multiple access approach is introduced. The latter two references do not explicitly estimate a 
location parameter (such as the mean or the median) of the sensed data. In these aforementioned 
schemes, the sensor power management issues arising from the dependence of the instantaneous 
transmit power on the sensing noise have not been addressed. Moreover, for sensors operating 
in adverse conditions, robustness to impulsive noise is of paramount importance, which has
not been addressed in the literature in the context of distributed estimation over multiple access 
channels. 

In this work, a distributed estimation scheme is considered where the sensor transmissions have 
constant modulus with fixed instantaneous transmit power. The proposed estimator is universal in 
the sense of [9] (or "distribution-free" in statistical parlance) in that the estimator does not depend 
on the distribution or the parameters of the sensing or channel noise. Unlike the orthogonal 
framework in [9], multiple access channels are considered herein, and the sensing noise is not 
assumed bounded. The estimator is shown to be strongly consistent for any noise distribution, 
including those with no finite moments, in the i.i.d. case. The distribution-free aspect is also 
very useful in heterogeneous scenarios where several different kinds of noise are simultaneously
present, such as additive Gaussian noise along with quantization noise. 

The sensors transmit with constant modulus transmissions whose phase is linear with the 
sensed data. The FC estimates a common location parameter (such as the mean, or the median) 
of the sensed signal where the sensing noise samples are not assumed to be identically distributed, 
or from any specific distribution. It is shown that the proposed estimator is strongly consistent 
even when the sensing noise is not identically distributed, provided that their variances are 
bounded. While the estimator is shown to be consistent in this general framework, the asymptotic 
variance of the estimator is derived for the i.i.d. sensing noise and shown to depend on its 
characteristic function (CF). Upper bounds on, and optimization of, the asymptotic variance with
respect to the transmit phase parameter $\omega$ are considered for different distributions of the sensing noise
including impulsive ones. The proposed estimator is compared with AF, where the robustness of 
the proposed estimator is highlighted. The effect of fading is shown to not affect the consistency 
of the estimator, but only to scale the asymptotic variance by a constant fading penalty depending 
on the fading statistics. 

II. System Model 
Consider the sensing model, with $L$ sensors,

$$x_i = \theta + \eta_i, \qquad i = 1, \ldots, L \tag{1}$$

where $\theta$ is an unknown real-valued parameter in a bounded interval $[0, \theta_R]$ of known length $\theta_R < \infty$, $\eta_i$ are mutually independent, symmetric real-valued noise samples with zero median (i.e., the pdf, when it exists, is symmetric about zero), and $x_i$ is the measurement at the $i$th sensor. Note that $\eta_i$ are not necessarily identically distributed or bounded, and need not have finite moments. We consider a setting where the $i$th sensor transmits its measurement using a constant modulus signal $\sqrt{\rho}\,e^{j\omega x_i}$ over a Gaussian multiple access channel so that the received signal at the fusion center (FC) is given by

$$y_L = \sqrt{\rho}\sum_{i=1}^{L} e^{j\omega x_i} + v \tag{2}$$

where the transmitted signal at each sensor has a per-sensor power of $\rho$, $0 < \omega \le 2\pi/\theta_R$ is a design parameter to be optimized, and $v$ is additive noise. Note that the restriction $\omega \in (0, 2\pi/\theta_R]$ is necessary even in the absence of sensing and channel noise, to uniquely determine $\theta$ from $y_L$.

¹ Referring to distributions whose tails decay slower than that of Gaussian noise.
Estimation in a single time snapshot is considered, which is why the time index is dropped. The transmitted signal has a deterministic fixed power $\rho$, and therefore does not suffer from the problems of random transmit power seen in AF schemes, where the transmitted signal from the $i$th sensor is given by $\alpha x_i = \alpha(\theta + \eta_i)$ with instantaneous power per sensor $\alpha^2(\theta + \eta_i)^2$, which is an unbounded random variable (RV) when $\eta_i$ is. In AF transmission, $\alpha$ is a coefficient which might depend on the sensor index, as well as on $L$ through a power constraint, but does not depend on $x_i$ [10], [11]. Note that the total transmit power from all the sensors in (2) is $\rho L$. We begin by considering a fixed total power constraint $P_T$, implying that the per-sensor power $\rho = P_T/L$ is a function of $L$. Later, in Section IV-A, we will also consider a fixed per-sensor power scheme where $\rho$ will not be a function of the number of sensors $L$, which implies $P_T \to \infty$ as $L \to \infty$.

III. The Estimation Problem 
We would like to estimate $\theta$ from $y_L$, which under the total power constraint is given by

$$y_L = e^{j\omega\theta}\sqrt{\frac{P_T}{L}}\sum_{i=1}^{L} e^{j\omega\eta_i} + v. \tag{3}$$

We do not assume that $\eta_i$ are identically distributed, or that $\eta_i$ are from any specific distribution, since a universal estimator which is independent of the distribution of $\eta_i$ is desired. Let

$$z_L := \frac{y_L}{\sqrt{L}} = e^{j\omega\theta}\sqrt{P_T}\,\frac{1}{L}\sum_{i=1}^{L} e^{j\omega\eta_i} + \frac{v}{\sqrt{L}} \tag{4}$$

and define $\varphi_{\eta_i}(\omega) := E[e^{j\omega\eta_i}]$ as the CF of $\eta_i$. Due to the law of large numbers we have

$$\frac{1}{L}\sum_{i=1}^{L} e^{j\omega\eta_i} \to \varphi(\omega) := \lim_{L\to\infty}\frac{1}{L}\sum_{i=1}^{L}\varphi_{\eta_i}(\omega) \tag{5}$$

(where $\to$ indicates convergence almost surely), and we use the fact that the variances $\mathrm{var}(e^{j\omega\eta_i}) = 1 - \varphi_{\eta_i}^2(\omega) \le 1$ are bounded to invoke Kolmogorov's strong law of large numbers for non-identically distributed RVs [12, pp. 259]. Since $\eta_i$ are symmetric, $\{\varphi_{\eta_i}(\omega)\}$ are real-valued and therefore $\varphi(\omega)$ is also real-valued.

Consider the conditions under which $\varphi(\omega)$ is a CF, which will be important in the consistency of the proposed estimator. Since convex combinations of CFs are CFs [13], the partial sums $L^{-1}\sum_{i=1}^{L}\varphi_{\eta_i}(\omega)$ are CFs as well. From the continuity theorem [13, Corollary 1.2.2], if a sequence of CFs converges pointwise to a function continuous at $\omega = 0$, then the limit is a CF. Therefore $\varphi(\omega)$ in (5) is a CF if $\varphi(\omega)$ is continuous at $\omega = 0$.

The natural estimator that we will adopt is based on the phase of $z_L$:

$$\hat{\theta} = \frac{1}{\omega}\tan^{-1}\left(\frac{z_L^I}{z_L^R}\right) \tag{6}$$

where $z_L^R := \mathrm{Re}\{z_L\}$ and $z_L^I := \mathrm{Im}\{z_L\}$. Note that this estimator does not depend on the distributions of $\eta_i$ or $v$, as desired. We now establish the strong consistency of the proposed estimator $\hat{\theta}$:

Theorem 1. The estimator $\hat{\theta}$ in (6) is strongly consistent provided that $\omega \in (0, 2\pi/\theta_R]$ is chosen to satisfy $\varphi(\omega) \neq 0$.

Proof: Taking the real and imaginary parts of (4) and using (5), due to the strong law of large numbers $z_L^R \to \bar{z}^R := \sqrt{P_T}\cos(\omega\theta)\varphi(\omega)$ and $z_L^I \to \bar{z}^I := \sqrt{P_T}\sin(\omega\theta)\varphi(\omega)$ almost surely. Since $\hat{\theta}$ in (6) is a continuous function of $[z_L^R\ z_L^I]$, $\hat{\theta} \to (1/\omega)\tan^{-1}(\bar{z}^I/\bar{z}^R) = \theta$ almost surely [14, Thm 3.14]. We need the assumption that $\varphi(\omega) \neq 0$ since otherwise $\theta$ cannot be uniquely determined from $\bar{z}^R$ and $\bar{z}^I$. ■
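As an illustration of Theorem 1, the following minimal sketch simulates one snapshot of (3) and (4) and forms a phase-based estimate as in (6). The names `estimate_theta`, `noise_sampler` and `cauchy` are hypothetical, the numerical values are arbitrary, and the four-quadrant phase is used in place of the plain arctangent so that any $\omega\theta \in [0, 2\pi)$ is recovered; that last point is an assumption about how (6) would be evaluated in practice, not something stated in the text.

```python
import cmath
import math
import random


def estimate_theta(theta, omega, L, total_power, noise_sampler, sigma_v, rng):
    """Simulate one snapshot under a total power constraint and return theta_hat."""
    rho = total_power / L  # per-sensor power rho = P_T / L
    # Constant-modulus transmissions add up over the multiple access channel.
    y = sum(
        math.sqrt(rho) * cmath.exp(1j * omega * (theta + noise_sampler(rng)))
        for _ in range(L)
    )
    # Circularly symmetric complex Gaussian channel noise with variance sigma_v^2.
    s = sigma_v / math.sqrt(2.0)
    y += complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    z = y / math.sqrt(L)
    # Four-quadrant phase mapped to [0, 2*pi), then scaled by 1/omega (assumed form of (6)).
    return (cmath.phase(z) % (2.0 * math.pi)) / omega


def cauchy(gamma):
    """Sampler for zero-median Cauchy noise with scale gamma (no finite moments)."""
    return lambda rng: gamma * math.tan(math.pi * (rng.random() - 0.5))


rng = random.Random(0)
theta_hat = estimate_theta(theta=1.0, omega=1.0, L=20000, total_power=1.0,
                           noise_sampler=cauchy(0.5), sigma_v=1.0, rng=rng)
print(theta_hat)  # expected to be close to theta = 1.0 for large L
```

Cauchy noise is chosen here only because it has no finite moments, which is the regime where a sample-mean based scheme would not be expected to converge.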
We now investigate when an $\omega$ that satisfies the conditions of Theorem 1 exists. Consider first the identically distributed case where $\eta_i$ have a common distribution with a RV $\eta$ so that $\varphi(\omega) = \varphi_\eta(\omega)$ is a CF. Many distributions such as Gaussian, Laplace, and Cauchy satisfy $\varphi_\eta(\omega) > 0$ for all $\omega$. If the common sensing noise distribution is known to have this property, then any choice of $\omega \in (0, 2\pi/\theta_R]$ would clearly satisfy the conditions of Theorem 1. In the more general case, where nothing is known or assumed about $\eta$, a sufficiently small $\omega$ satisfies $\varphi(\omega) > 0$ since all CFs are continuous and equal to 1 at the origin. So, for identically distributed sensing noise, an $\omega$ for which (6) is strongly consistent can always be found, even if the sensing noise variance does not exist.

In the general non-identically distributed case, this argument does not follow since $\varphi(\omega)$ is not necessarily a CF. However, if $\varphi(\omega)$ is continuous at $\omega = 0$, it is a CF by the continuity theorem [13] and the argument above follows. For an example of when $\varphi(\omega)$ is not a CF and not continuous at $\omega = 0$, consider a case where $\sum_{i=1}^{\infty}\varphi_{\eta_i}(\omega) < \infty$ for all $\omega > 0$, such as when $\eta_i$ are Gaussian with variances that depend on $i$ linearly: $\varphi_{\eta_i}(\omega) = e^{-\sigma_i^2\omega^2/2}$ where $\sigma_i^2 = i\sigma^2$, and $\sum_{i=1}^{\infty}\varphi_{\eta_i}(\omega) = (e^{\sigma^2\omega^2/2} - 1)^{-1} < \infty$ by the geometric sum formula. In this case, due to the $1/L$ factor in (5), $\varphi(\omega) = 0$ when $\omega > 0$, and $\varphi(0) = 1$. For this example, $\varphi(\omega)$ is not a CF for any distribution, and there exists no $\omega$ that satisfies the requirements of Theorem 1. Clearly, this is a very severe case where the sensing noise variance increases linearly with the sensor index, without bound. In fact, the example above can be generalized to distributions other than Gaussian, and variances going to infinity even slower than linearly. For absolutely continuous sensing noise distributions, when $\eta_i$ are expressed as a scalar multiple of an underlying random variable, and these scalars (which are proportional to standard deviations when they exist) go to infinity, it can be shown that the estimator in (6) is not consistent, which is proved next.

Theorem 2. Let the sensing noise at the $i$th sensor be a scaled version of a RV $\eta$ with absolutely continuous distribution so that $\eta_i = \sigma_i\eta$ and $\varphi_{\eta_i}(\omega) = \varphi_\eta(\sigma_i\omega)$. Suppose also that $\lim_{i\to\infty}\sigma_i = \infty$. Then there is no $\omega$ that satisfies the conditions of Theorem 1.

Proof: Recalling from (5) the definition of $\varphi(\omega)$, we would like to show that $\varphi(\omega) := \lim_{L\to\infty}L^{-1}\sum_{i=1}^{L}\varphi_\eta(\sigma_i\omega) = 0$ for $\omega > 0$. Since $\eta$ has an absolutely continuous distribution, $\lim_{\omega\to\infty}\varphi_\eta(\omega) = 0$, and because $\lim_{i\to\infty}\sigma_i = \infty$, it follows that $\lim_{i\to\infty}\varphi_\eta(\sigma_i\omega) = 0$ for $\omega > 0$. From [14, pp. 411] we know that if a sequence $a_i$ satisfies $\lim_{i\to\infty}a_i = 0$ then $\lim_{L\to\infty}L^{-1}\sum_{i=1}^{L}a_i = 0$, which gives us the proof when applied to the sequence $\varphi_\eta(\sigma_i\omega)$. ■
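The example with linearly growing Gaussian variances that precedes Theorem 2 can be checked numerically. The sketch below evaluates the finite-$L$ average in (5) for $\sigma_i^2 = i\sigma^2$; the function name `averaged_cf` and the chosen values of $\omega$ and $L$ are illustrative assumptions only.

```python
import math


def averaged_cf(omega, L, sigma2=1.0):
    """(1/L) * sum_{i=1}^{L} exp(-i * sigma2 * omega^2 / 2), i.e. the average in (5)."""
    r = math.exp(-sigma2 * omega ** 2 / 2.0)
    total, term = 0.0, 1.0
    for _ in range(L):
        term *= r      # term now equals r**i, the CF of the i-th sensor's noise at omega
        total += term
    return total / L


for L in (10, 1000, 100000):
    print(L, averaged_cf(0.5, L))  # decays roughly like 1/L for any fixed omega > 0
```

At $\omega = 0$ the average stays at 1 for every $L$, while for any fixed $\omega > 0$ it shrinks toward 0, which is the discontinuity at the origin described in the text.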
The following theorem can loosely be regarded as a converse to Theorem 2 and shows that the estimator in (6) is consistent when the variances exist and are bounded.²

Theorem 3. Let $\mathrm{var}(\eta_i)$ exist for all $i$ and $\sigma_{\max} := \sup_i(\mathrm{var}(\eta_i))^{1/2}$ be finite. Then any $0 < \omega < \min(2\pi/\theta_R, \sqrt{2}/\sigma_{\max})$ satisfies $\varphi(\omega) > 0$, thereby fulfilling the requirement of Theorem 1 on $\omega$.

Proof: From [13, pp. 89] we have $\varphi_{\eta_i}(\omega) \ge 1 - \sigma_i^2\omega^2/2$ for any CF with finite variance $\sigma_i^2 := \mathrm{var}(\eta_i)$. Using (5) we have $\varphi(\omega) \ge 1 - (\lim_{L\to\infty}L^{-1}\sum_{i=1}^{L}\sigma_i^2)\,\omega^2/2 \ge 1 - \sigma_{\max}^2\omega^2/2 > 0$, where the last inequality holds provided that $\omega < \sqrt{2}/\sigma_{\max}$. Since also $\omega < 2\pi/\theta_R$ we have the theorem. ■
The estimator in (6) relies on constant modulus transmissions from the sensors to the FC, and is strongly consistent over a wide range of scenarios outlined above. However, the performance of $\hat{\theta}$ will depend on statistical assumptions on $\{\eta_i\}$ and $v$. The following theorem characterizes this performance, under the assumption that $v \sim \mathcal{CN}(0, \sigma_v^2)$ and $\{\eta_i\}$ are identically distributed with an arbitrary common distribution.

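Theorem 4. $\sqrt{L}(\hat{\theta} - \theta)$ is asymptotically normal with zero mean and variance given by

$$\mathrm{AsV}(\omega) = \frac{\frac{\sigma_v^2}{P_T} + 1 - \varphi_\eta(2\omega)}{2\omega^2\varphi_\eta^2(\omega)}. \tag{7}$$

Proof: Please see Appendix 1. ■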
Note that in the i.i.d. case (4) is the empirical characteristic function (ECF) [13] of $\eta_i$
corrupted by additive noise. While the ECF has been studied extensively in the statistical 
literature for constructing centralized estimators [13], it has not been addressed in the context of 
communication of samples as in distributed estimation, and therefore issues of power constraint 
and channel noise have not arisen in the literature on parameter estimation with ECFs. 
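Since the asymptotic variance in Theorem 4 depends on the sensing noise only through its CF, it can be evaluated with a single generic routine. The following sketch assumes the form $(\sigma_v^2/P_T + 1 - \varphi_\eta(2\omega)) / (2\omega^2\varphi_\eta^2(\omega))$ for (7); the helper names (`asv`, `gaussian_cf`, `cauchy_cf`, `laplace_cf`) and the parameter `noise_to_power`, standing for $\sigma_v^2/P_T$, are hypothetical.

```python
import math


def asv(omega, cf, noise_to_power=0.0):
    """Evaluate (7) for a real-valued CF `cf`; noise_to_power stands for sigma_v^2 / P_T."""
    return (noise_to_power + 1.0 - cf(2.0 * omega)) / (2.0 * omega ** 2 * cf(omega) ** 2)


def gaussian_cf(sigma):
    return lambda w: math.exp(-(sigma * w) ** 2 / 2.0)


def cauchy_cf(gamma):
    return lambda w: math.exp(-gamma * abs(w))


def laplace_cf(sigma):
    b2 = sigma ** 2 / 2.0  # b^2 = sigma^2 / 2
    return lambda w: 1.0 / (1.0 + b2 * w * w)


for name, cf in [("gaussian", gaussian_cf(1.0)),
                 ("cauchy", cauchy_cf(1.0)),
                 ("laplace", laplace_cf(1.0))]:
    print(name, [round(asv(w, cf, noise_to_power=0.1), 4) for w in (0.25, 0.5, 1.0)])
```

Passing the CF as a callable keeps the routine distribution-agnostic, mirroring the way the theorem is stated.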

IV. Analysis and Optimization of the AsV 

The proposed estimator is consistent under general conditions and does not depend on the noise parameters. However, if the noise distribution and parameters are available, it is possible to minimize the AsV with respect to $\omega$ over the interval $(0, 2\pi/\theta_R]$:

$$\mathrm{AsV}^* := \inf_{\omega\in(0,2\pi/\theta_R]}\frac{\frac{\sigma_v^2}{P_T} + 1 - \varphi_\eta(2\omega)}{2\omega^2\varphi_\eta^2(\omega)}. \tag{8}$$

We will consider this problem with both per-sensor and total power constraints.

² It is not a true converse for two main reasons: (i) $\lim_{i\to\infty}\sigma_i = \infty$ required by Theorem 2 is not the opposite of $\sigma_i$ being bounded, which is required by Theorem 3, since it is possible that neither may occur; (ii) Theorem 2 requires absolute continuity whereas Theorem 3 does not.



A. Per-sensor Power Constraint 

Our derivation for the estimator $\hat{\theta}$ in (6), its strong consistency in Theorem 1, and the asymptotic variance in (7) had assumed that $P_T$ is fixed and not a function of $L$. In the fixed per-sensor power constraint case the total power $P_T = \rho L$ increases linearly with $L$, in which case the estimator is given in (6) with $z_L := y_L/L$, which we redefine with an extra factor of $1/\sqrt{L}$ in (4). In this case, the statement of Theorem 1 still holds exactly, with minor modifications in the proof, and $\sigma_v^2/P_T \to 0$ as $L \to \infty$. Hence, having a per-sensor power constraint is asymptotically equivalent to having no channel noise. In either case (8) becomes

$$\mathrm{AsV}^*_{PS} := \inf_{\omega\in(0,2\pi/\theta_R]}\frac{1 - \varphi_\eta(2\omega)}{2\omega^2\varphi_\eta^2(\omega)}, \tag{9}$$

which is a special case of (8). The reason we consider this case separately is that, as we will see, the objective in (9) is bounded near the origin, which makes the solution of (9) considerably different from that of (8). We now consider solving (9), and investigate the behavior of $\mathrm{AsV}(\omega)$ near the origin to see under what conditions small $\omega$ will yield optimum performance. Using l'Hôpital's rule, it is seen that $\lim_{\omega\to 0}\mathrm{AsV}(\omega) = \sigma_\eta^2$, the variance of $\eta$, when $\eta$ has finite variance. In fact, when the fourth moment $\mu_4$ of $\eta$ also exists, we have a stronger result:

Theorem 5. If the first four moments of $\eta$ exist, then $\mathrm{AsV}(\omega)$ in equation (9) satisfies

$$\mathrm{AsV}(\omega) = \sigma_\eta^2 - \frac{\kappa_\eta\sigma_\eta^4}{3}\,\omega^2 + o(\omega^2) \tag{10}$$

as $\omega \to 0$, where $\kappa_\eta := \mu_4/\sigma_\eta^4 - 3$ is the excess kurtosis of $\eta$.

Proof: We have already established that the first term in (10) is $\sigma_\eta^2$. Using the Maclaurin series expansion of $\varphi_\eta(\omega)$ in terms of the second and fourth moments of $\eta$, the numerator and denominator of (9) can be expressed as $N(\omega) := 2\sigma_\eta^2\omega^2 - (2/3)\mu_4\omega^4 + o(\omega^4)$ and $D(\omega) := 2\omega^2\left(1 - (1/2)\sigma_\eta^2\omega^2 + (\mu_4/24)\omega^4 + o(\omega^4)\right)^2$, respectively. By taking the second derivative and evaluating at $\omega = 0$ we have

$$\left.\frac{d^2}{d\omega^2}\frac{N(\omega)}{D(\omega)}\right|_{\omega=0} = -\frac{2}{3}\mu_4 + 2\sigma_\eta^4. \tag{11}$$

Dividing by $2!$ we obtain the coefficient of $\omega^2$ in the Maclaurin series, as given in (10). ■
Theorem 5 has some interesting implications. By making $\omega$ sufficiently small, we can obtain an AsV that is arbitrarily close to $\sigma_\eta^2$. Also, if the excess kurtosis of the sensing noise is positive, it is possible to improve the AsV to a value smaller than $\sigma_\eta^2$ by increasing $\omega$ in the neighborhood of 0, which shows that if $\kappa_\eta > 0$, (9) satisfies $\mathrm{AsV}^*_{PS} < \sigma_\eta^2$. This is the case for impulsive distributions like the Laplace distribution where $\kappa_\eta = 3$. When $\eta$ is Gaussian, the excess kurtosis $\kappa_\eta = 0$ and therefore it is not clear from (10) if $\mathrm{AsV}^*_{PS} < \sigma_\eta^2$ is possible, since (10) only applies near $\omega = 0$. The following theorem sheds more light on this issue.

Theorem 6. If $\eta$ is Gaussian then the best asymptotic performance for $\hat{\theta}$ in (6) for the per-sensor power constraint satisfies $\mathrm{AsV}^*_{PS} = \sigma_\eta^2$.

Proof: Equation (10) shows that $\lim_{\omega\to 0}\mathrm{AsV}(\omega) = \sigma_\eta^2$, which implies that $\mathrm{AsV}^*_{PS} \le \sigma_\eta^2$. To see that $\mathrm{AsV}^*_{PS} \ge \sigma_\eta^2$, consider a benchmark genie-aided sample mean estimator $\hat{\theta}_{GA} = L^{-1}\sum_{i=1}^{L}x_i$ that has access to the sensor measurements $\{x_i\}_{i=1}^{L}$, rather than the normalized channel output in (4). The sample mean, which has an asymptotic variance of $\sigma_\eta^2$, achieves the Cramér-Rao bound (CRB) for an estimator of $\theta$ from $\{x_i\}_{i=1}^{L}$ since it is an efficient estimator of the mean when $\eta$ is Gaussian. Since $\theta \to \{x_i\}_{i=1}^{L} \to z_L$ forms a Markov chain, from the data processing inequality for the CRB [15], the CRB for estimators of $\theta$ based on $z_L$ is at least that obtained for the genie-aided setup of estimating $\theta$ from $\{x_i\}_{i=1}^{L}$, which is $\sigma_\eta^2$. Therefore, the best achievable performance in the per-sensor power case cannot be better than that of $\hat{\theta}_{GA}$, which implies $\mathrm{AsV}^*_{PS} \ge \sigma_\eta^2$. ■

Note that in the proof of Theorem 6 we used the Gaussianity only to assert that the sample 
mean achieves the CRB. Theorem 6 also holds for any other distribution with this property. 

In Figure 1 we show the analytical expressions for $\mathrm{AsV}(\omega)$ for various distributions. For the Cauchy distribution the performance is unbounded at the origin since the variance does not exist. For all other distributions, we selected $\sigma_\eta^2 = 1$, which is the value of $\mathrm{AsV}(\omega)$ near the origin. Note that the Laplace distribution, which has a positive excess kurtosis, corresponds to an AsV which is decreasing near the origin, as predicted by (10), whereas the Gaussian and uniform distributions are increasing near the origin from their infimum value of $\sigma_\eta^2 = 1$.

To conclude, for this per-sensor power constraint case, small $\omega$ yields good asymptotic performance which does not depend on $\theta_R$. The performance can be improved by appropriately increasing $\omega$ in the neighborhood of $\omega = 0$ when $\eta$ is from an impulsive distribution with positive excess kurtosis.
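The small-$\omega$ behaviour described in this subsection can be checked numerically. The sketch below compares the per-sensor objective in (9) with a second-order approximation of the form $\sigma_\eta^2 - \kappa_\eta\sigma_\eta^4\omega^2/3$, for unit-variance Laplace noise ($\kappa_\eta = 3$) and Gaussian noise ($\kappa_\eta = 0$). The function names are hypothetical and the approximation coefficient is taken as an assumption consistent with (11).

```python
import math


def asv_per_sensor(omega, cf):
    """Objective of (9): (1 - cf(2w)) / (2 w^2 cf(w)^2), i.e. no channel noise term."""
    return (1.0 - cf(2.0 * omega)) / (2.0 * omega ** 2 * cf(omega) ** 2)


def laplace_cf(w, sigma2=1.0):
    return 1.0 / (1.0 + (sigma2 / 2.0) * w * w)


def gaussian_cf(w, sigma2=1.0):
    return math.exp(-sigma2 * w * w / 2.0)


def small_omega_approx(omega, sigma2, excess_kurtosis):
    """Second-order approximation near omega = 0 (assumed form of (10))."""
    return sigma2 - excess_kurtosis * sigma2 ** 2 * omega ** 2 / 3.0


for w in (0.05, 0.1, 0.2):
    print(w,
          asv_per_sensor(w, laplace_cf), small_omega_approx(w, 1.0, 3.0),
          asv_per_sensor(w, gaussian_cf), small_omega_approx(w, 1.0, 0.0))
```

For Laplace noise the values fall below $\sigma_\eta^2 = 1$ as $\omega$ grows from zero, whereas for Gaussian noise they stay at or above 1, in line with Theorems 5 and 6.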



B. Total Power Constraint 

In this case $P_T$ is not a linear function of $L$ as it was in the previous section, but a constant, so that the AsV is given by (7). Note that small $\omega$ should be avoided in the solution of (8) since $\lim_{\omega\to 0}\mathrm{AsV}(\omega) = \infty$, i.e., the limit is no longer finite, as seen also in Figure 2 for various fading distributions. For the same reason, one may use min instead of inf in (8) for the total power constraint case, since the minimum is always achieved by a strictly positive $\omega$ when $\sigma_v^2 > 0$.

1) Upper bound on AsV: In what follows, we use the lower bound $\varphi_\eta(\omega) \ge 1 - \sigma_\eta^2\omega^2/2$ in order to upper bound $\mathrm{AsV}^*$ in (8). We have the following theorem, which applies when $\theta_R$ is large enough so that $(2\pi\sigma_\eta)/\theta_R < \sqrt{2}$.

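Theorem 7. The best achievable performance $\mathrm{AsV}^*$ in (8) for any sensing noise distribution with finite variance satisfies

$$\mathrm{AsV}^* \le \sigma_\eta^2\,\frac{\frac{\sigma_v^2}{P_T} + \frac{c}{4}}{\frac{c}{4}\left(1 - \frac{c}{16}\right)^2} \tag{12}$$

whenever $c/8 \le (2\pi/\theta_R)^2\sigma_\eta^2 < 2$, where $c := -3\sigma_v^2/P_T + \sqrt{\sigma_v^2/P_T}\,\sqrt{32 + 9\sigma_v^2/P_T}$. On the other hand, when $(2\pi/\theta_R)^2\sigma_\eta^2 < c/8$ then

$$\mathrm{AsV}^* \le \frac{\frac{\sigma_v^2}{P_T} + 2\sigma_\eta^2\left(\frac{2\pi}{\theta_R}\right)^2}{2\left(\frac{2\pi}{\theta_R}\right)^2\left(1 - \frac{\sigma_\eta^2}{2}\left(\frac{2\pi}{\theta_R}\right)^2\right)^2}. \tag{13}$$

Proof: Please see Appendix 2. ■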
Note that if the range of the unknown parameter determined by $\theta_R$ is large, the upper bound in (13) will be tighter since the bound $\varphi_\eta(\omega) \ge 1 - \sigma_\eta^2\omega^2/2$ is tighter when $\omega$ is small. Moreover, when $\theta_R$ is large, (13) simplifies to $(\sigma_v^2/P_T)(\theta_R^2/8\pi^2) + \sigma_\eta^2$. This shows that if the range $\theta_R$ increases, the optimal achievable performance $\mathrm{AsV}^*$ increases as well. In addition to large $\theta_R$, when $\sigma_v^2 \ll P_T$ as in the per-sensor power constraint case, the bound further simplifies to $\mathrm{AsV}^* \le \sigma_\eta^2$. The bound in Theorem 7 holds regardless of the distribution of $\eta$ as long as it has finite variance.
Instead of working with bounds, if exact solutions to (8) are desired, then it is necessary to
specify the sensing noise distribution. In what follows, the problem is specialized by considering 
some common distributions. The resulting asymptotic variances for the different distributions are 
illustrated in Figure 2. 

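2) Gaussian Sensing Noise: In this case, we have $\varphi_\eta(\omega) = \exp(-\sigma_\eta^2\omega^2/2)$ so that

$$\mathrm{AsV}_G(\omega) = \frac{e^{\sigma_\eta^2\omega^2}\left(\frac{\sigma_v^2}{P_T} + 1 - e^{-2\sigma_\eta^2\omega^2}\right)}{2\omega^2}. \tag{14}$$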



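We would like to minimize (14) over $\omega \in (0, 2\pi/\theta_R]$ as in (8). As an intermediary step, we first characterize the unconstrained minimum over $\omega \in [0, \infty)$. To simplify (14) we substitute $\beta := \sigma_\eta^2\omega^2$. Note that the value of $\omega$ that minimizes (14) over $\omega > 0$ is related to the $\beta > 0$ that minimizes $\mathrm{AsV}_G(\sqrt{\beta}/\sigma_\eta)$ through $\omega = \sqrt{\beta}/\sigma_\eta$. Differentiating with respect to $\beta$, we have

$$\frac{d\,\mathrm{AsV}_G(\sqrt{\beta}/\sigma_\eta)}{d\beta} = \sigma_\eta^2 e^{-\beta}\,\frac{\left(\frac{\sigma_v^2}{P_T} + 1\right)(\beta - 1)e^{2\beta} + \beta + 1}{2\beta^2}. \tag{15}$$

Any stationary point of $\mathrm{AsV}_G(\sqrt{\beta}/\sigma_\eta)$ with respect to $\beta$ satisfies

$$\left(\frac{\sigma_v^2}{P_T} + 1\right)(\beta - 1)e^{2\beta} + (\beta + 1) = 0. \tag{16}$$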



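Let any solution to (16) be denoted as $\beta_G^*$. It is straightforward to show that $d^2\mathrm{AsV}_G(\sqrt{\beta}/\sigma_\eta)/d\beta^2\,|_{\beta=\beta_G^*}$ is positive. This proves that $\beta_G^*$ is the unique unconstrained minimum of $\mathrm{AsV}_G(\sqrt{\beta}/\sigma_\eta)$ over $\beta > 0$, which in turn implies that $\omega_G = \sqrt{\beta_G^*}/\sigma_\eta$ is the corresponding unique minimizer of $\mathrm{AsV}_G(\omega)$ for $\omega > 0$. Since $\mathrm{AsV}_G(\omega)$ has a unique minimum, it is monotonically decreasing over $\omega \in (0, \sqrt{\beta_G^*}/\sigma_\eta]$. The solution to (8) in the Gaussian case therefore is

$$\omega_G^* = \min\left(\frac{2\pi}{\theta_R}, \frac{\sqrt{\beta_G^*}}{\sigma_\eta}\right) \tag{17}$$

where $\beta_G^*$ is the unique solution to (16).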

While there have been some efforts in the physics community [16] to define functions that solve for the intersection point of rational functions and exponentials as in (16), there is no widely accepted formula. However, (16) can be easily solved numerically to optimize $\omega$ when $\eta$ is Gaussian.
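A numerical solution of (16) can be sketched with plain bisection. The code below assumes the stationarity condition $(\sigma_v^2/P_T + 1)(\beta - 1)e^{2\beta} + \beta + 1 = 0$ with $\sigma_v^2/P_T > 0$, for which the left-hand side is negative at $\beta = 0$ and positive at $\beta = 1$; the names `beta_gaussian`, `omega_gaussian` and the argument `g` (standing for $\sigma_v^2/P_T$) are hypothetical.

```python
import math


def beta_gaussian(g, tol=1e-12):
    """Root of (g + 1)(beta - 1) e^{2 beta} + beta + 1 = 0 in (0, 1), for g > 0."""
    def h(b):
        return (g + 1.0) * (b - 1.0) * math.exp(2.0 * b) + b + 1.0

    lo, hi = 0.0, 1.0  # h(0) = -g < 0 and h(1) = 2 > 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def omega_gaussian(g, sigma_eta, theta_R):
    """Constrained minimizer as in (17): min(2*pi/theta_R, sqrt(beta*) / sigma_eta)."""
    return min(2.0 * math.pi / theta_R, math.sqrt(beta_gaussian(g)) / sigma_eta)


print(beta_gaussian(1.0))             # unconstrained optimum in beta for sigma_v^2 / P_T = 1
print(omega_gaussian(1.0, 1.0, 20.0))
```

As $\sigma_v^2/P_T$ grows the root approaches $\beta = 1$, that is $\omega = 1/\sigma_\eta$, which matches the low channel SNR discussion later in the section.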

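3) Cauchy Sensing Noise: For the Cauchy distribution $\varphi_\eta(\omega) = e^{-\gamma\omega}$ for $\omega > 0$. It is well known that no moments of this distribution exist. Substituting $\varphi_\eta(\omega)$ in (7), we have

$$\mathrm{AsV}_C(\omega) = \frac{\frac{\sigma_v^2}{P_T} + 1 - e^{-2\gamma\omega}}{2\omega^2 e^{-2\gamma\omega}}. \tag{18}$$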
As in the Gaussian case, we first find the stationary points of (18) on $\omega \in [0, \infty)$ by taking the derivative of (18) and equating to zero to obtain

$$\beta_C^* = \frac{1}{2\gamma}\left[2 + W\left(\frac{-2e^{-2}P_T}{\sigma_v^2 + P_T}\right)\right] \tag{19}$$

where $W(\cdot)$ is the Lambert function defined to be the inverse function of $xe^x$. It can be verified that $\mathrm{AsV}_C''(\beta_C^*) > 0$ and therefore $\beta_C^*$ is the unique unconstrained minimum of $\mathrm{AsV}_C(\omega)$. Hence, $\mathrm{AsV}_C(\omega)$ has a unique minimum over $\omega > 0$, and the solution to (8) in this case is

$$\omega_C^* = \min\left(\frac{2\pi}{\theta_R}, \beta_C^*\right). \tag{20}$$
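The Lambert-function solution for Cauchy noise can be evaluated without special-function libraries. The sketch below assumes that the principal branch of $W$ is intended, that the argument is $-2e^{-2}/(\sigma_v^2/P_T + 1)$, and that the unconstrained optimum is $(2 + W(\cdot))/(2\gamma)$; the names `lambert_w0`, `omega_cauchy` and `g` (for $\sigma_v^2/P_T$) are hypothetical.

```python
import math


def lambert_w0(x, tol=1e-14):
    """Principal branch of Lambert W for -1/e < x <= 0, computed with Newton's method."""
    w = 0.0
    for _ in range(200):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) < tol:
            break
    return w


def omega_cauchy(g, gamma, theta_R):
    """Assumed closed form: min(2*pi/theta_R, (2 + W(-2 e^{-2} / (g + 1))) / (2*gamma))."""
    arg = -2.0 * math.exp(-2.0) / (g + 1.0)  # equals -2 e^{-2} P_T / (sigma_v^2 + P_T)
    unconstrained = (2.0 + lambert_w0(arg)) / (2.0 * gamma)
    return min(2.0 * math.pi / theta_R, unconstrained)


om = omega_cauchy(g=1.0, gamma=2.0, theta_R=1.0)
beta = 2.0 * om  # beta = gamma * omega
# Stationarity condition of (18) in terms of beta; should be close to zero.
print(om, (1.0 + 1.0) * (beta - 1.0) * math.exp(2.0 * beta) + 1.0)
```

Starting Newton's method at $w = 0$ keeps the iterates on the principal branch for arguments in $(-1/e, 0]$, which covers every value the argument above can take.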



4) Laplace Sensing Noise: In this case, we have $\varphi_\eta(\omega) = (1 + b^2\omega^2)^{-1}$ where $b^2 := \sigma_\eta^2/2$. Substituting $\beta := b^2\omega^2$ for convenience, (7) for Laplace noise becomes

$$\mathrm{AsV}_L(\sqrt{\beta}/b) = b^2\left(\frac{\sigma_v^2}{P_T} + \frac{4\beta}{1 + 4\beta}\right)\frac{(1 + \beta)^2}{2\beta}. \tag{21}$$

To characterize the stationary points of (21), we take the derivative with respect to $\beta$ and equate to zero. The optimum value is the root of a 4th order polynomial. Using the only solution with a positive root we have



where 



c = 



125 -f +258 -f +141 



12 Vf + 1 



+ 



25^ + 4 



(22) 



+8 



1/3 



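It is also possible to verify that the second derivative is positive at the optimal point. To express the roots of the 4th order polynomial in closed form and to verify that the second derivative is always positive, we have used Mathematica. Using (22), $\omega^2 b^2 = \beta$, and the fact that $\omega \in (0, 2\pi/\theta_R]$, we have the solution to (8) as

$$\omega_L^* = \min\left(\frac{2\pi}{\theta_R}, \frac{\sqrt{\beta_L^*}}{b}\right) \tag{23}$$

where $\beta_L^*$ denotes the optimal value of $\beta$ given by (22).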
5) Uniform Sensing Noise: We now assume that $\eta$ is uniformly distributed on $[-a, a]$, so that $a^2 = 3\sigma_\eta^2$. In this case $\varphi_\eta(\omega) = \sin(\omega a)/(\omega a)$ and we need to optimize

$$\mathrm{AsV}_U(\omega) = \frac{a^2\left(\frac{\sigma_v^2}{P_T} + 1 - \frac{\sin(2\omega a)}{2\omega a}\right)}{2\sin^2(\omega a)} \tag{24}$$

over $\omega \in (0, 2\pi/\theta_R]$. Note that $\mathrm{AsV}_U(\omega)$ is undefined at $\omega = \pi/a$. We begin by showing that the range of $\omega$ can be further reduced to $\omega \in (0, \min(2\pi/\theta_R, \pi/a))$ in solving (8). This is because both $\sin^2(\omega a)$ and $\sin(2\omega a)$ are periodic with period $\pi/a$, and therefore due to the $2\omega a$ term in the denominator, $\mathrm{AsV}_U(\omega) < \mathrm{AsV}_U(\omega + k\pi/a)$ for any positive integer $k$, and $\omega > 0$.

In order to minimize (24) over $\omega \in (0, \min(2\pi/\theta_R, \pi/a))$, we first disregard the constraint on $\omega$ imposed by $\theta_R$, and focus on $\omega \in (0, \pi/a)$. Substituting $\beta := \omega a$, differentiating $\mathrm{AsV}_U(\beta/a)$ with respect to $\beta$ and equating to zero we obtain

$$\left[8\beta^2\left(\frac{\sigma_v^2}{P_T} + 1\right) - 1\right]\cos(\beta) + \cos(3\beta) - 4\beta\sin(\beta) = 0. \tag{25}$$

By taking the second derivative, it can be verified that $\mathrm{AsV}_U(\beta/a)$ is convex, and therefore (25) has a unique solution $\beta_U^*$ over $\beta \in (0, \pi)$ corresponding to the unique minimum of $\mathrm{AsV}_U(\beta/a)$ over the same interval. It is immediate that $\omega = \beta_U^*/a$ is the unique minimum of (24) over $\omega \in (0, \pi/a)$, and therefore (24) is a monotonically decreasing function over $(0, \beta_U^*/a)$. Incorporating the effect of $\theta_R$, we have that if $2\pi/\theta_R \le \beta_U^*/a$ then the minimum of (24) over $\omega \in (0, \min(2\pi/\theta_R, \pi/a))$ is attained at $\omega = 2\pi/\theta_R$, and if $2\pi/\theta_R > \beta_U^*/a$ then it is attained at $\omega = \beta_U^*/a$. In short,

$$\omega_U^* = \min\left(\frac{2\pi}{\theta_R}, \frac{\beta_U^*}{a}\right). \tag{26}$$

Note that a closed-form solution to (25) is not possible; however, a numerical solution can be easily found. Recall also from Section IV that for uniform noise, which has $\kappa_\eta = -6/5$, a small $\omega > 0$ should be chosen when $\sigma_v^2/P_T = 0$. If instead $\sigma_v^2/P_T \gg 0$, then $\omega \approx \pi/2a$ (or $2\pi/\theta_R$, whichever one is smaller) is a good choice. We will elaborate on this more in Section IV-C, where we consider the low channel SNR regime.
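For uniform sensing noise the stationary point has no closed form, so a numerical root finder is needed. The sketch below assumes the stationarity condition $[8\beta^2(\sigma_v^2/P_T + 1) - 1]\cos\beta + \cos 3\beta - 4\beta\sin\beta = 0$ with $\beta = \omega a$, and brackets the root between a small positive value and $\pi/2$, which is valid when $\sigma_v^2/P_T$ is not too close to zero. The names `asv_uniform`, `beta_uniform` and `g` (for $\sigma_v^2/P_T$) are hypothetical.

```python
import math


def asv_uniform(beta, a, g):
    """(24) written in terms of beta = omega * a."""
    return a ** 2 * (g + 1.0 - math.sin(2.0 * beta) / (2.0 * beta)) / (2.0 * math.sin(beta) ** 2)


def beta_uniform(g, tol=1e-12):
    """Bisection on the assumed stationarity condition over (0, pi/2), for g > 0."""
    def q(b):
        return ((8.0 * b * b * (g + 1.0) - 1.0) * math.cos(b)
                + math.cos(3.0 * b) - 4.0 * b * math.sin(b))

    lo, hi = 1e-3, math.pi / 2.0  # q(lo) > 0 and q(pi/2) = -2*pi < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if q(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b_star = beta_uniform(1.0)
print(b_star, asv_uniform(b_star, a=math.sqrt(3.0), g=1.0))
```

For large $\sigma_v^2/P_T$ the root approaches $\pi/2$, consistent with the choice $\omega \approx \pi/2a$ mentioned above.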

6) Compound Gaussian Sensing Noise: Compound Gaussian is a class of RVs which, when conditioned on the variance, are Gaussian. So when $\eta$ is compound Gaussian, it can be written as $\eta = \sqrt{X}G$ where $G$ is a Gaussian RV with zero mean and variance one, and $X$ is a positive RV. It is easy to show that the CF of $\eta$ can be expressed in terms of the moment generating function (MGF) of $X$:

$$\varphi_\eta(\omega) = E\left[e^{-\frac{1}{2}X\omega^2}\right] = M_X\left(-\tfrac{1}{2}\omega^2\right) \tag{27}$$

where $M_X(t) := E[e^{tX}]$ is the MGF of $X$ when the expectation exists. Note that $E[X] = \sigma_\eta^2$ in general, and if the CDF of $X$ is a unit step at $\sigma_\eta^2$ then $\eta$ is Gaussian with variance $\sigma_\eta^2$. For compound Gaussian sensing noise, (27) can be substituted in (7) to obtain

$$\mathrm{AsV}_{CG}(\omega) = \frac{\frac{\sigma_v^2}{P_T} + 1 - M_X(-2\omega^2)}{2\omega^2 M_X^2\left(-\tfrac{1}{2}\omega^2\right)} \tag{28}$$

whenever the MGF exists.

September 29, 2009 DRAFT

When the per-sensor power is fixed so that $\sigma_v^2/P_T \to 0$ as $L \to \infty$, (28) can be expanded near $\omega = 0$ to obtain

$$\mathrm{AsV}_{CG}(\omega) = E[X] - \sigma_X^2\omega^2 + o(\omega^2) \tag{29}$$

which is the same as (10), expressed in terms of the mean $E[X]$ and variance $\sigma_X^2$ of $X$. When $\sigma_X^2 = 0$, $X$ is a constant and $\eta$ is Gaussian. If instead $\sigma_X^2 > 0$, then AsV can be improved by increasing $\omega$ in the neighborhood of 0, implying that $\mathrm{AsV}^*_{PS} < \sigma_\eta^2$.

As a concrete example, consider Middleton Class-A noise [17] where the variance RV is discrete and given by $X = \sigma_\eta^2\left[Y/(A(T+1)) + T/(T+1)\right]$, $A$ and $T$ are deterministic parameters controlling the impulsiveness of the noise $\eta$, and $Y$ is a Poisson RV with parameter $A$. In this case,

$$M_X(t) = \exp\left(\frac{t\sigma_\eta^2 T}{T+1}\right)\exp\left(A\left(\exp\left(\frac{t\sigma_\eta^2}{A(T+1)}\right) - 1\right)\right). \tag{30}$$

Substituting in (28) we obtain the AsV. The resulting expression shows that when $T = 0$ (highly impulsive noise) $\mathrm{AsV}_{CG}(\omega) \to 0$ as $\omega \to \infty$, in which case $\omega$ should be chosen as large as possible (i.e., $\omega = 2\pi/\theta_R$). Another interesting aspect of this expression is that it illustrates that $\mathrm{AsV}(\omega)$ need not have a unique local minimum (i.e., it need not be convex or quasi-convex) for every sensing noise distribution. In fact, as will be seen in Figure 5 of the Simulations section, $\mathrm{AsV}(\omega)$ can have multiple local minima, unlike the Gaussian, Cauchy and Laplace cases considered thus far.
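The compound Gaussian expressions can be evaluated directly from the MGF of the variance RV. The sketch below assumes a Middleton Class-A MGF of the form $\exp(t\sigma_\eta^2 T/(T+1))\exp(A(\exp(t\sigma_\eta^2/(A(T+1))) - 1))$, normalized so that $E[X] = \sigma_\eta^2$, and plugs it into an AsV of the form (28). The function names, the argument `g` (for $\sigma_v^2/P_T$) and the parameter values are hypothetical and are not the ones used for the paper's figures.

```python
import math


def middleton_mgf(t, A, T, sigma2=1.0):
    """Assumed Class-A MGF of the variance RV X, normalized so that E[X] = sigma2."""
    return (math.exp(t * sigma2 * T / (T + 1.0))
            * math.exp(A * (math.exp(t * sigma2 / (A * (T + 1.0))) - 1.0)))


def asv_compound_gaussian(omega, mgf, g=0.0):
    """(28): (g + 1 - M_X(-2 w^2)) / (2 w^2 M_X(-w^2 / 2)^2)."""
    return (g + 1.0 - mgf(-2.0 * omega ** 2)) / (2.0 * omega ** 2 * mgf(-0.5 * omega ** 2) ** 2)


def mgf_impulsive(t):
    return middleton_mgf(t, A=0.1, T=0.0)  # T = 0: highly impulsive case


for w in (1.0, 10.0, 100.0, 1000.0):
    print(w, asv_compound_gaussian(w, mgf_impulsive, g=0.1))
```

With $T = 0$ the printed values decay toward zero as $\omega$ grows, which is the behaviour noted in the text; scanning a fine grid of $\omega$ for other $(A, T)$ pairs is one way to look for multiple local minima.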
C. Low Channel SNR Regime

When $\sigma_v^2/P_T$ is sufficiently large, the $\varphi_\eta(2\omega)$ term in (7) is negligible, thereby transforming the problem in (8) into maximizing $\omega^2\varphi_\eta^2(\omega)$ over $(0, 2\pi/\theta_R]$. We now briefly summarize how the solutions in the previous subsection simplify in this regime. Since we already have closed form expressions for the solution of (8) for the Cauchy and Laplace cases, we only focus on the Gaussian and uniform cases.

For the Gaussian case, maximizing $\omega^2 e^{-\sigma_\eta^2\omega^2}$ over $\omega \in (0, 2\pi/\theta_R]$ yields $\omega^* = \min(2\pi/\theta_R, 1/\sigma_\eta)$. If $\theta_R$ is sufficiently small so that $\omega^* = 1/\sigma_\eta$, then we have

$$\mathrm{AsV}_G(1/\sigma_\eta) = \frac{e\,\sigma_\eta^2}{2}\left(\frac{\sigma_v^2}{P_T} + 1 - e^{-2}\right) \tag{31}$$

which is an upper bound on the best achievable performance $\mathrm{AsV}^*$, even when the channel SNR is not low, but becomes tighter at low channel SNR.

For the uniform case we maximize $\sin^2(\omega a)$, which yields $\omega^* = \min(2\pi/\theta_R, \pi/2a)$. If $\theta_R$ is small enough, $\omega^* = \pi/(2a)$ and $\mathrm{AsV}_U(\pi/2a) \approx (a^2/2)(\sigma_v^2/P_T)$.

V. Comparison with Amplify and Forward Scheme

In the AF scheme, the transmitted signal at the $i$th sensor is $\alpha_L x_i$ where $\alpha_L$ depends on the number of sensors $L$ to maintain the total power constraint, but is independent of $x_i$ [10], [11]. We focus on the i.i.d. case for simplicity, and choose $\alpha_L$ identical across sensors due to symmetry. In what follows, we will show that the asymptotic performance of AF is competitive with that of the proposed scheme when the sensing noise has finite variance, and inferior to the proposed scheme when the sensing noise is impulsive.

The received signal for AF is

$$y_L = \alpha_L\sum_{i=1}^{L}(\theta + \eta_i) + v. \tag{32}$$

We have already alluded to the fact that the per-sensor power $\alpha_L^2(\theta + \eta_i)^2$ is an unbounded RV when the pdf of the sensing noise has infinite support. This is undesirable especially for low-power sensor networks with limited peak-power capabilities. Therefore, before we compare the asymptotic variances of the proposed estimator and AF, we reiterate that with respect to the management of the instantaneous transmit power of sensors, the proposed estimator is preferable to AF.
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Since the total instantaneous power is random for AF, the total power is defined as an average 
Pt = a\ Yld=i + with respect to the sensing noise distribution. We will consider a 
total power constraint case where Pt is not a function of L so that ul = ^J^^^^^ = 0{L~^/'^). 

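The estimator in AF is given by $\hat\theta_{AF} = \Re\{y_L\}/(L\alpha_L)$ so that 

$$\sqrt{L}\,(\hat\theta_{AF} - \theta) = \frac{1}{\sqrt{L}}\sum_{i=1}^{L}\eta_i + \sqrt{\frac{\theta^2 + \sigma_\eta^2}{P_T}}\;\Re\{v\} \tag{33}$$

with an AsV of 

$$\mathrm{AsV}_{AF} = \sigma_\eta^2 + \frac{\sigma_v^2}{2P_T}\left(\theta^2 + \sigma_\eta^2\right) \tag{34}$$

when $\eta_i$ has finite variance. 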
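
As a minimal sketch of how (33) leads to (34), the following Python functions compute the AF gain, the asymptotic variance, and the normalized variance at a finite $L$. It assumes zero-mean sensing noise with variance $\sigma_\eta^2$ and circularly symmetric channel noise, so that $\mathrm{var}(\Re\{v\}) = \sigma_v^2/2$; the function names and numerical values are hypothetical.

```python
import math

def alpha_L(L, theta, sigma_eta2, P_T):
    """AF gain that meets the average total power P_T in the i.i.d. case."""
    return math.sqrt(P_T / (L * (theta ** 2 + sigma_eta2)))

def asv_af(theta, sigma_eta2, sigma_v2, P_T):
    """Asymptotic variance of AF as in (34)."""
    return sigma_eta2 + sigma_v2 / (2.0 * P_T) * (theta ** 2 + sigma_eta2)

def normalized_var_af(L, theta, sigma_eta2, sigma_v2, P_T):
    """L * var(theta_hat_AF - theta) at finite L, read off term by term from (33)."""
    a = alpha_L(L, theta, sigma_eta2, P_T)
    sensing_term = sigma_eta2                        # variance of (1/sqrt(L)) * sum of eta_i
    channel_term = (sigma_v2 / 2.0) / (L * a ** 2)   # variance of Re{v} / (sqrt(L) * alpha_L)
    return sensing_term + channel_term

for L in (1, 10, 1000):
    print(L, alpha_L(L, 2.0, 1.0, 10.0), normalized_var_af(L, 2.0, 1.0, 1.0, 10.0))
```

Because $L\alpha_L^2$ does not depend on $L$, the normalized variance equals $\mathrm{AsV}_{AF}$ for every $L$ under these assumptions.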

Consider now the special case of no channel noise ($\sigma_v^2 = 0$), which implies $\mathrm{AsV}_{AF} = \sigma_\eta^2$. In 
Section IV-A we have seen that $\mathrm{AsV}^* < \sigma_\eta^2$ is possible when the sensing noise is impulsive 
enough to have a positive excess kurtosis. In that case, the proposed approach outperforms AF when there is 
no channel noise. We now examine the more general case of $\sigma_v^2 > 0$. 

Observe that (34) depends explicitly on $\theta$, whereas (8) depends on the estimation range $\theta_R$. 
Since it is difficult to compare these expressions in general, we will examine the cases of large 
and small $\theta$. When $\theta$ is large, $\mathrm{AsV}_{AF} \approx \sigma_\eta^2 + (\sigma_v^2/P_T)(\theta^2/2)$, and by the discussion after (13), 
$\mathrm{AsV}^* \approx \sigma_\eta^2 + (\sigma_v^2/P_T)(\theta_R^2/(8\pi^2))$. Note that when the parameter $\theta$ is close to its upper limit, the 
proposed estimator will outperform AF. However, when $\theta$ is very small despite a large range 
$\theta_R$, AF will outperform the proposed approach. 
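
The two large-range approximations can be compared directly. The sketch below assumes the approximate expressions $\sigma_\eta^2 + (\sigma_v^2/P_T)(\theta^2/2)$ for AF and $\sigma_\eta^2 + (\sigma_v^2/P_T)(\theta_R^2/(8\pi^2))$ for the proposed scheme; equating them gives a crossover at $\theta = \theta_R/(2\pi)$, a value derived here for illustration rather than stated in the text. Function names are hypothetical.

```python
import math

def asv_af_large(theta, sigma_eta2, snr_inv):
    """Large-theta approximation of AsV_AF; snr_inv = sigma_v^2 / P_T."""
    return sigma_eta2 + snr_inv * theta ** 2 / 2.0

def asv_proposed_large(theta_R, sigma_eta2, snr_inv):
    """Large-theta_R approximation of AsV* for the proposed scheme."""
    return sigma_eta2 + snr_inv * theta_R ** 2 / (8.0 * math.pi ** 2)

def crossover_theta(theta_R):
    """Value of theta at which the two approximations coincide."""
    return theta_R / (2.0 * math.pi)

theta_R = 12.0
print(crossover_theta(theta_R))
print(asv_proposed_large(theta_R, 1.0, 0.1), asv_af_large(2.0, 1.0, 0.1), asv_af_large(1.0, 1.0, 0.1))
```

These expressions are only approximations, so the crossover should be read as a rough guide rather than an exact threshold.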

Let us now examine the case of small $\theta$ and $\theta_R$, where we focus on the Gaussian case. For 
this purpose, we bound the difference in performance between the proposed estimator and AF: 

$$\mathrm{AsV}^* - \mathrm{AsV}_{AF} \le \mathrm{AsV}_G(1/\sigma_\eta) - \mathrm{AsV}_{AF} \tag{35}$$

$$= \sigma_\eta^2\left(\frac{e - e^{-1}}{2} - 1\right) + \frac{\sigma_v^2}{2P_T}\left((e - 1)\,\sigma_\eta^2 - \theta^2\right) \tag{36}$$

where the inequality is because (31) is an upper bound on $\mathrm{AsV}^*$. Examining the bound in (36), 
we note that its sign depends on the channel SNR through $\sigma_v^2/P_T$ and on the sensing SNR $\theta^2/\sigma_\eta^2$. In 
conclusion, the proposed approach is competitive with AF and may outperform it, depending on 
the specific parameter values, when the sensing noise has finite variance. In what follows, the 
heavy-tailed sensing noise case is discussed. 
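
The sign of the bound can be explored numerically. The sketch below assumes $\mathrm{AsV}_G(1/\sigma_\eta) = (e\sigma_\eta^2/2)(\sigma_v^2/P_T + 1 - e^{-2})$ and $\mathrm{AsV}_{AF} = \sigma_\eta^2 + (\sigma_v^2/(2P_T))(\theta^2 + \sigma_\eta^2)$, and it presumes $1/\sigma_\eta \le 2\pi/\theta_R$ so that $\omega = 1/\sigma_\eta$ is admissible. The function names and test values are hypothetical.

```python
import math

def asv_gauss_at_inverse_sigma(sigma_eta2, snr_inv):
    """Assumed closed form of AsV_G(1/sigma_eta); snr_inv = sigma_v^2 / P_T."""
    return 0.5 * math.e * sigma_eta2 * (snr_inv + 1.0 - math.exp(-2.0))

def asv_af(theta, sigma_eta2, snr_inv):
    """Assumed closed form of AsV_AF."""
    return sigma_eta2 + 0.5 * snr_inv * (theta ** 2 + sigma_eta2)

def bound_gap(theta, sigma_eta2, snr_inv):
    """Upper bound on AsV* - AsV_AF, collected into sensing and channel terms."""
    sensing_part = sigma_eta2 * (math.sinh(1.0) - 1.0)   # (e - 1/e)/2 - 1 = sinh(1) - 1
    channel_part = 0.5 * snr_inv * ((math.e - 1.0) * sigma_eta2 - theta ** 2)
    return sensing_part + channel_part

for snr_inv in (0.1, 1.0):
    print(snr_inv, bound_gap(2.0, 1.0, snr_inv))
```

A negative value certifies that the proposed scheme has the smaller asymptotic variance for those parameters; a positive value is inconclusive, since only an upper bound on $\mathrm{AsV}^*$ is being compared.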

With the AF approach the normalized multiple access channel output is proportional to 
the sample mean, which is not a good estimator of $\theta$ when the sensing noise is heavy-tailed. To 
illustrate with a specific example, consider the case when $\eta_i$ is Cauchy. Dividing both sides of 
(33) by $\sqrt{L}$, it is clear that $(\hat\theta_{AF} - \theta) \to 0$ is not possible since the sample mean $L^{-1}\sum_{i=1}^{L}\eta_i$ 
is Cauchy distributed and has the same distribution as $\eta_i$ regardless of the value of $L$. Since the 
sample mean is not a consistent estimator for Cauchy noise, the AF approach over multiple access 
channels fails for such a heavy-tailed distribution. On the other hand, the proposed estimator 
is strongly consistent in the presence of any noise distribution, including Cauchy. This brief 
example illustrates the inherent robustness of our approach in the presence of heavy-tailed 
sensing noise distributions. The sample mean, "computed" by the multiple access channel in 
the AF approach, is highly suboptimal, and sometimes not consistent, as in the Cauchy case, 
whereas in the proposed approach the channel computes (a noisy and normalized version of) the 
empirical characteristic function of the sensed samples, from which a consistent estimator can 
be constructed for any sensing noise distribution. 
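
The contrast between the sample mean and the characteristic-function-based estimate can be illustrated with a small simulation. The sketch below is not the authors' simulation code: it assumes standard Cauchy sensing noise, a total power constraint with per-sensor power $P_T/L$, circularly symmetric Gaussian channel noise, and an estimator of the form $\hat\theta = \angle(y_L)/\omega$ with $\omega\theta \in (0, 2\pi)$. For AF only the noiseless sample mean is computed, since the average transmit power is not finite for Cauchy noise. All names and parameter values are hypothetical.

```python
import cmath
import math
import random

def cauchy_trial(L, theta=2.0, omega=0.5, P_T=10.0, sigma_v2=1.0, seed=0):
    """One realization of both estimates under standard Cauchy sensing noise."""
    rng = random.Random(seed)
    # Standard Cauchy samples by inverting the CDF: tan(pi * (U - 1/2)).
    x = [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(L)]

    # Sample mean, to which the AF channel output is proportional (channel noise omitted).
    theta_mean = sum(x) / L

    # Constant modulus transmissions with per-sensor power P_T / L plus channel noise.
    s = math.sqrt(sigma_v2 / 2.0)
    v = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
    y = math.sqrt(P_T / L) * sum(cmath.exp(1j * omega * xi) for xi in x) + v

    # Phase of the channel output, mapped to [0, 2*pi), divided by omega.
    theta_cm = (cmath.phase(y) % (2.0 * math.pi)) / omega
    return theta_cm, theta_mean

for L in (100, 1000, 20000):
    print(L, cauchy_trial(L, seed=L))
```

Across realizations the phase-based estimate concentrates around $\theta$ as $L$ grows, while the sample mean keeps the spread of a single Cauchy sample.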

To be fair to AF, even though it suffers from having potentially large peak powers, we also 
want to point out the situations under which it is preferable to the proposed approach. The first 
point is that AF does not require the parameter $\theta$ to be bounded, and it does not require fine-tuning 
of a transmission parameter like $\omega$. Moreover, AF is also a "universal" estimator, albeit 
over a smaller class of distributions (those that have finite variance) for the sensing noise. 

In conclusion, the proposed estimator with its fixed instantaneous power per sensor is inherently 
preferable to AF when the sensors have a small dynamic range. Moreover, for AF, the 
total transmit power depends on $\theta$ and the statistics of the sensing noise. On the other hand, 
the AF approach has the benefit of not assuming $\theta$ to be in a bounded set, and sometimes has a 
better finite sample performance as seen in the simulations. For impulsive noise distributions 
with finite variance and positive excess kurtosis like Laplace, or heavy-tailed distributions with 
infinite variance like Cauchy, the proposed approach is superior to AF. For other regimes, the two 
schemes are competitive and their asymptotic performance comparison depends on the specific 
values of the parameters $\theta$, $\theta_R$, $\sigma_\eta^2$, $\sigma_v^2$, and $P_T$. 

VI. Fading Channels 

Suppose that the multiple access channel connecting the sensors to the FC has fading so that 
(2) becomes 

$$y_L = \sqrt{\rho}\sum_{i=1}^{L} |h_i|\, e^{j\omega x_i} + v \tag{37}$$

where $|h_i|$ is the amplitude of the channel coefficient $h_i$ between the $i^{th}$ sensor and the FC, 
satisfying $E[|h_i|^2] = 1$. Even though the channel $h_i$ is complex valued, the effective channel $|h_i|$ 
is real and positive when the $i^{th}$ sensor corrects for the channel phase before transmission, using 
local channel phase information. Such a phase correction does not change the constant power 
nature of the transmission. 

The following Theorem characterizes the performance of the proposed estimator over fading 
channels: 



Theorem 8. For the channel in (37) the estimator $\hat\theta$ in (6) is asymptotically normal with variance 

$$\mathrm{AsV}(\omega) = (E[|h_i|])^{-2}\;\frac{\dfrac{\sigma_v^2}{P_T} + 1 - \varphi_\eta(2\omega)}{2\omega^2\varphi_\eta^2(\omega)}. \tag{38}$$

Proof: The proof is similar to that of Theorem 1 with the following changes: 

$$v_c = \frac{1}{2} + \frac{1}{2}\varphi_\eta(2\omega) - (E[|h_i|])^2\,\varphi_\eta^2(\omega)$$

and both $G_1$ and $G_2$ are scaled by a factor of $(E[|h_i|])^{-1}$. Substituting these in (42) we obtain 
(38). ■ 
Since $E[|h_i|^2] = 1$, using Jensen's inequality, the $(E[|h_i|])^{2}$ factor due to fading is always less 
than one, unless $|h_i|$ is deterministic, so the penalty $(E[|h_i|])^{-2}$ in (38) is at least one. In fact, when $|h_i|$ is Ricean the loss due to fading is given 
by $\left(\Gamma(3/2)\,e^{-K}\,{}_1F_1(3/2; 1; K)/\sqrt{K+1}\right)^{-2}$, where ${}_1F_1(\cdot;\cdot;\cdot)$ is the confluent hypergeometric 
function [18, pp. 504] and $K$ is the Ricean parameter. This expression reduces to $4/\pi$ when 
$K = 0$, which corresponds to Rayleigh fading channels. In the AF setting, the difference between fading and 
no fading also exhibits the same loss, which was analyzed in detail in [4], [5] for different fading 
distributions, where the Nakagami case was also considered. Note that if the optimization of the 
asymptotic variance is desired in the fading case, the fading loss does not affect the optimum 
value of $\omega$, so equations (17), (20), (23), and (26) remain valid for the different sensing noise 
distributions. 
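
The Ricean fading penalty can be evaluated with a short series computation. The sketch below assumes $E[|h_i|] = \Gamma(3/2)\,e^{-K}\,{}_1F_1(3/2; 1; K)/\sqrt{K+1}$ under the normalization $E[|h_i|^2] = 1$, and sums the power series of ${}_1F_1$ directly, which is adequate for moderate $K$. The helper names are hypothetical.

```python
import math

def hyp1f1(a, b, z, tol=1e-16, max_terms=10000):
    """Confluent hypergeometric function 1F1(a; b; z) summed from its power series."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) / (b + n) * z / (n + 1)   # ratio of consecutive series terms
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def ricean_fading_penalty(K):
    """(E|h|)^(-2) for Ricean fading with E|h|^2 = 1 and Ricean parameter K."""
    mean_h = math.gamma(1.5) * math.exp(-K) * hyp1f1(1.5, 1.0, K) / math.sqrt(K + 1.0)
    return mean_h ** -2

for K in (0.0, 1.0, 10.0):
    print(K, ricean_fading_penalty(K))
```

At $K = 0$ the function returns the Rayleigh value $4/\pi$, and the penalty shrinks toward one as the line-of-sight component strengthens.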
VII. Simulations 

In what follows, we corroborate our analytical results through Monte Carlo simulations, and 
also examine finite-sample effects that are not predictable from our asymptotic results. 

In Figures 1 and 2 we compare $\mathrm{AsV}(\omega)$ and $L\,\mathrm{var}(\hat\theta - \theta)$ versus $\omega$ for the per-sensor and total 
power constraints, respectively. We begin by acknowledging that the variance of the asymptotic 
distribution, $\mathrm{AsV}(\omega)$, and the normalized limiting variance $L\,\mathrm{var}(\hat\theta - \theta)$ are not always equal 
in general [19, pp. 437]. However, as the next two figures show, they are in agreement for the 
proposed estimator. The mismatch that occurs for small $\omega$ is due to the number of samples not 
being sufficiently large in both Figures 1 and 2. To focus more on this mismatch, in Figures 3 
and 4 we consider smaller values of $L$ and an increased range for $\omega$ for the Gaussian sensing 
noise case. As expected, for reduced values of $L$ the mismatch increases, especially for small 
and large values of $\omega$. Note that for the per-sensor power constraint case, although $\mathrm{AsV}(\omega)$ is 
bounded near the origin, with finite samples, $L\,\mathrm{var}(\hat\theta - \theta)$ is large for small $\omega$, an effect which 
is more pronounced for small $L$. This suggests that for the per-sensor power constraint case 
$\omega$ should not be chosen arbitrarily small, especially when $L$ is small, to avoid this finite-sample 
artifact. 
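
A comparison of this kind can be sketched with a small Monte Carlo experiment. The code below is an illustrative stand-in, not the simulation used for the figures: it assumes Gaussian sensing noise, a total power constraint with per-sensor power $P_T/L$, circularly symmetric Gaussian channel noise, an estimator of the form $\hat\theta = \angle(y_L)/\omega$, and the form of (7) given in the docstring. Names, seeds, and parameter values are hypothetical.

```python
import cmath
import math
import random

def asv_gaussian(omega, sigma_eta, snr_inv):
    """Assumed (7): (snr_inv + 1 - phi(2w)) / (2 w^2 phi(w)^2), Gaussian phi."""
    phi_w = math.exp(-0.5 * (omega * sigma_eta) ** 2)
    phi_2w = math.exp(-2.0 * (omega * sigma_eta) ** 2)
    return (snr_inv + 1.0 - phi_2w) / (2.0 * omega ** 2 * phi_w ** 2)

def normalized_variance(L, omega, theta=2.0, sigma_eta=1.0,
                        sigma_v2=1.0, P_T=10.0, trials=500, seed=1):
    """Monte Carlo estimate of L * var(theta_hat - theta)."""
    rng = random.Random(seed)
    s = math.sqrt(sigma_v2 / 2.0)
    amp = math.sqrt(P_T / L)
    errors = []
    for _trial in range(trials):
        # Channel noise, then the superposition of the L constant modulus signals.
        y = complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        for _sensor in range(L):
            y += amp * cmath.exp(1j * omega * (theta + rng.gauss(0.0, sigma_eta)))
        theta_hat = (cmath.phase(y) % (2.0 * math.pi)) / omega
        errors.append(theta_hat - theta)
    mean_err = sum(errors) / trials
    return L * sum((e - mean_err) ** 2 for e in errors) / (trials - 1)
```

Calling `normalized_variance(200, 1.0)` and comparing against `asv_gaussian(1.0, 1.0, 0.1)` gives a rough check of the asymptotic prediction; repeating the call with a smaller `L` or a smaller `omega` is one way to look for the finite-sample mismatch described above.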

In Figure 5 we compare $\mathrm{AsV}(\omega)$ and $L\,\mathrm{var}(\hat\theta - \theta)$ versus $\omega$ for the per-sensor and total 
power constraints for Middleton Class A noise. In addition to the agreement of the 
theory and simulations, these plots illustrate that $\mathrm{AsV}(\omega)$ need not be a convex or quasiconvex 
function of $\omega$ with a unique local minimum. For all the other noise distributions, $\mathrm{AsV}(\omega)$ did 
exhibit a unique local minimum, which was helpful in finding the optimal value of $\omega$. 
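
When $\mathrm{AsV}(\omega)$ may have several local minima, a plain grid scan that records every local minimum is a safer way to locate the optimum than a local descent method. The sketch below assumes the form of (7) given in the docstring and is demonstrated only on the Gaussian characteristic function; a different characteristic function can be passed in as `phi`. The helper names and grid size are hypothetical.

```python
import math

def asv(omega, phi, snr_inv):
    """Assumed (7) for a generic real characteristic function phi; snr_inv = sigma_v^2 / P_T."""
    return (snr_inv + 1.0 - phi(2.0 * omega)) / (2.0 * omega ** 2 * phi(omega) ** 2)

def local_minima(f, lo, hi, n=2000):
    """Interior grid points of (lo, hi) where f is below both of its neighbors."""
    ws = [lo + (hi - lo) * k / n for k in range(1, n)]
    vals = [f(w) for w in ws]
    return [(ws[k], vals[k]) for k in range(1, len(ws) - 1)
            if vals[k] < vals[k - 1] and vals[k] < vals[k + 1]]

def phi_gauss(w):
    """Characteristic function of zero-mean, unit-variance Gaussian noise."""
    return math.exp(-0.5 * w * w)

minima = local_minima(lambda w: asv(w, phi_gauss, 0.1), 0.0, 3.0)
print(minima)
```

For the Gaussian case the scan returns a single candidate; with a characteristic function such as that of Middleton Class A noise, the same scan would list every candidate so that the smallest one can be selected.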

Figures 6 and 7 show $L\,\mathrm{var}(\hat\theta - \theta)$ versus $L$ for the per-sensor and total power constraints, 
respectively. The optimal value of $\omega$ that minimizes the AsV is chosen for the total power 
constraint case. For the per-sensor power constraint case in Figure 6, we did not use the minimizer 
of $\mathrm{AsV}(\omega)$ due to the aforementioned finite-sample effects. Instead, the value of $\omega$ is chosen to 
minimize $L\,\mathrm{var}(\hat\theta - \theta)$ in Figure 1 (which assumes $L = 500$) and applied to all values of $L$ in 
Figure 6. It is seen that convergence is slower for the heavy-tailed Cauchy distribution. At 
about $L = 50$, all cases converge in both figures. Figure 8 illustrates the effect of Rayleigh 
fading on the performance for Gaussian sensing and channel noise. It is seen that $L\,\mathrm{var}(\hat\theta - \theta)$ 
converges to its theoretically predicted asymptotic value, with a ratio of about $4/\pi$ compared 
to the non-fading case. 

In Figure 9 the proposed scheme is compared with AF. The performance of the proposed 
approach is seen to be better or worse than AF depending on the value of $\theta$. Another 
interesting aspect of Figure 9 is the flatness of the curves for the AF case. This can be seen 
by finding the variance of equation (33), which is a constant function of $L$. In contrast, the 
normalized variance for the proposed estimator is seen to depend on $L$ in Figure 9. 

To illustrate the robustness of the proposed estimator, Figure 10 compares it with AF for 
Cauchy sensing noise. One realization of the estimation error is plotted for each value of $L$ to 
illustrate that in the presence of Cauchy noise, the performance of AF does not converge despite 
the increase in $L$, whereas the proposed estimator is consistent. 

VIII. Conclusions 

A distributed estimation scheme relying on constant modulus transmissions from the sensors 
is proposed over Gaussian multiple access channels. The instantaneous transmit power does not 
depend on the random sensing noise, which is a desirable feature for low-power sensors with 
limited peak power capabilities. In the i.i.d. case, the estimator is shown to be strongly consistent 
for any sensing or channel noise distribution. In the non-identically distributed case, a bound 
on the variances is shown to be a sufficient condition for strong consistency. The asymptotic 
variance is derived and shown to depend on the characteristic function of the sensing noise; 
it is bounded for the general case, and also optimized with respect to $\omega$ for various noise 
distributions. In addition to the desirable constant-power feature, the proposed estimator is robust 
to impulsive noise, and remains consistent even when the mean and variance of the sensing noise 
do not exist. It is argued that over Gaussian multiple access channels, the AF estimator is 
effectively a noisy sample mean of the sensed data. For sensing noise distributions for which the 
sample mean is highly suboptimal or inconsistent, the proposed estimator is shown to outperform 
AF. The effect of fading is also considered, and shown to affect the asymptotic variance by a 
constant fading penalty factor. 

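Appendix 1: Proof of Theorem 4 

We begin by observing that the $2 \times 1$ vector sequence $\sqrt{L}\,[z_L^R - \bar z^R,\ z_L^I - \bar z^I]^T$ is asymptotically 
normal with zero mean, due to the central limit theorem. The elements of its asymptotic 
covariance matrix can be calculated to be 

$$\Sigma_{11} := P_T\left[\cos^2(\omega\theta)\, v_c + \sin^2(\omega\theta)\, v_s\right] + \frac{\sigma_v^2}{2} \tag{39}$$

$$\Sigma_{22} := P_T\left[\cos^2(\omega\theta)\, v_s + \sin^2(\omega\theta)\, v_c\right] + \frac{\sigma_v^2}{2} \tag{40}$$

$$\Sigma_{12} = \Sigma_{21} := P_T \sin(\omega\theta)\cos(\omega\theta)\,(v_c - v_s) \tag{41}$$

where, for brevity, we have $v_c := \mathrm{var}(\cos\omega\eta_i) = (1/2) + \varphi_\eta(2\omega)/2 - \varphi_\eta^2(\omega)$ and $v_s := 
\mathrm{var}(\sin\omega\eta_i) = (1/2) - \varphi_\eta(2\omega)/2$. Applying [14, Thm 3.16] the asymptotic variance is given by 

$$\mathrm{AsV} = G_1^2\,\Sigma_{11} + 2G_1G_2\,\Sigma_{12} + G_2^2\,\Sigma_{22} \tag{42}$$

where 

$$G_1 := \frac{\partial}{\partial x}\left[\frac{1}{\omega}\tan^{-1}\!\left(\frac{y}{x}\right)\right]_{x=\bar z^R,\; y=\bar z^I} = -\frac{1}{\omega}\,\frac{\tan(\omega\theta)}{\left(1 + \tan^2(\omega\theta)\right)\sqrt{P_T}\,\varphi_\eta(\omega)\cos(\omega\theta)} \tag{43}$$

$$G_2 := \frac{\partial}{\partial y}\left[\frac{1}{\omega}\tan^{-1}\!\left(\frac{y}{x}\right)\right]_{x=\bar z^R,\; y=\bar z^I} = \frac{1}{\omega}\,\frac{1}{\left(1 + \tan^2(\omega\theta)\right)\sqrt{P_T}\,\varphi_\eta(\omega)\cos(\omega\theta)}. \tag{44}$$

Substituting in (42) and simplifying we obtain the theorem. 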
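
The simplification step can be verified numerically. The sketch below assumes the covariance entries and gradients in the forms $\Sigma_{11} = P_T[\cos^2(\omega\theta)v_c + \sin^2(\omega\theta)v_s] + \sigma_v^2/2$, $\Sigma_{22} = P_T[\cos^2(\omega\theta)v_s + \sin^2(\omega\theta)v_c] + \sigma_v^2/2$, $\Sigma_{12} = P_T\sin(\omega\theta)\cos(\omega\theta)(v_c - v_s)$, $G_1 = -\sin(\omega\theta)/(\omega\sqrt{P_T}\varphi_\eta(\omega))$ and $G_2 = \cos(\omega\theta)/(\omega\sqrt{P_T}\varphi_\eta(\omega))$, and compares the quadratic form with the closed form $(\sigma_v^2/P_T + 1 - \varphi_\eta(2\omega))/(2\omega^2\varphi_\eta^2(\omega))$. Function names and test values are hypothetical.

```python
import math

def delta_method_asv(omega, theta, phi_w, phi_2w, P_T, sigma_v2):
    """Quadratic form G1^2 S11 + 2 G1 G2 S12 + G2^2 S22 with the assumed entries."""
    vc = 0.5 + 0.5 * phi_2w - phi_w ** 2      # var(cos(omega * eta))
    vs = 0.5 - 0.5 * phi_2w                   # var(sin(omega * eta))
    c, s = math.cos(omega * theta), math.sin(omega * theta)
    S11 = P_T * (c * c * vc + s * s * vs) + sigma_v2 / 2.0
    S22 = P_T * (c * c * vs + s * s * vc) + sigma_v2 / 2.0
    S12 = P_T * s * c * (vc - vs)
    r = math.sqrt(P_T) * phi_w                # modulus of the limiting channel output
    G1 = -s / (omega * r)                     # same value as the tan-based form
    G2 = c / (omega * r)
    return G1 * G1 * S11 + 2.0 * G1 * G2 * S12 + G2 * G2 * S22

def closed_form_asv(omega, phi_w, phi_2w, P_T, sigma_v2):
    """Closed-form asymptotic variance, which does not depend on theta."""
    return (sigma_v2 / P_T + 1.0 - phi_2w) / (2.0 * omega ** 2 * phi_w ** 2)

omega = 0.7
phi_w = math.exp(-0.5 * omega ** 2)           # Gaussian noise with unit variance
phi_2w = math.exp(-2.0 * omega ** 2)
for theta in (0.3, 1.0, 2.0):
    print(theta, delta_method_asv(omega, theta, phi_w, phi_2w, 10.0, 1.0),
          closed_form_asv(omega, phi_w, phi_2w, 10.0, 1.0))
```

The dependence on $\theta$ cancels because the $v_c$ terms drop out and $\sin^4 + 2\sin^2\cos^2 + \cos^4 = 1$.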



Appendix 2: Proof of Theorem 7 

Using the bound $\varphi_{\eta_i}(\omega) \ge 1 - \sigma_\eta^2\omega^2/2$, we have for all $\omega$, $\varphi_\eta(2\omega) \ge 1 - 2\sigma_\eta^2\omega^2$, and for 
$\omega < \sqrt{2}/\sigma_\eta$, $\varphi_\eta^2(\omega) \ge (1 - \sigma_\eta^2\omega^2/2)^2$. Substituting in (7) we have for $\omega < \sqrt{2}/\sigma_\eta$ 

$$\mathrm{AsV}(\omega) \le \frac{\dfrac{\sigma_v^2}{P_T} + 2\sigma_\eta^2\omega^2}{2\omega^2\left(1 - \sigma_\eta^2\omega^2/2\right)^2}. \tag{45}$$

Recall that $2\pi/\theta_R < \sqrt{2}/\sigma_\eta$ by assumption. Therefore, the upper bound (45) is valid over the entire 
range of $\omega$ values involved in the minimization in (8). We can therefore minimize both sides 
of (45) over $\omega \in (0, 2\pi/\theta_R]$. Substituting for convenience $\beta := \omega^2\sigma_\eta^2$ we have 

$$\mathrm{AsV}^* \le \min_{\beta \in (0,\,(2\pi\sigma_\eta/\theta_R)^2]} \frac{\dfrac{\sigma_v^2}{P_T} + 2\beta}{2\beta\,(1 - \beta/2)^2}\;\sigma_\eta^2. \tag{46}$$

The unconstrained minimizer can be found by differentiating the objective in (46) and is given by $\beta = c/8$, with a 
corresponding minimum given by the right hand side (rhs) of (12). It can be checked that $c/8$ is 
the unique minimum of the unconstrained problem. This shows that if $(2\pi\sigma_\eta/\theta_R)^2 \ge c/8$ then 
the rhs of (46) is given by the rhs of (12). 

To show (13), recall that $c/8$ is the unique unconstrained minimum of the objective on the 
rhs of (46). This implies that as a function of $\beta$ it is non-increasing over $(0, c/8)$ so that when 
$(2\pi\sigma_\eta/\theta_R)^2 < c/8$ the minimum over $(0, (2\pi\sigma_\eta/\theta_R)^2]$ is achieved at $\beta = (2\pi\sigma_\eta/\theta_R)^2$, which yields 
the rhs of (13). 
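
The location of the unconstrained minimizer can be checked on a grid. The constant $c$ is defined in Theorem 7, outside this appendix; the sketch below assumes $c = \sqrt{9s^2 + 32s} - 3s$ with $s = \sigma_v^2/P_T$, which is what setting the derivative of the assumed objective $\sigma_\eta^2(s + 2\beta)/(2\beta(1 - \beta/2)^2)$ to zero gives, since the stationarity condition reduces to $2\beta^2 + (3s/2)\beta - s = 0$. The function names are hypothetical.

```python
import math

def bound_objective(beta, s, sigma_eta2=1.0):
    """Assumed objective of the bound in beta = omega^2 * sigma_eta^2; s = sigma_v^2 / P_T."""
    return sigma_eta2 * (s + 2.0 * beta) / (2.0 * beta * (1.0 - beta / 2.0) ** 2)

def c_of(s):
    """Assumed constant c such that c / 8 is the positive root of 2b^2 + 1.5*s*b - s = 0."""
    return math.sqrt(9.0 * s * s + 32.0 * s) - 3.0 * s

def grid_minimizer(s, n=20000):
    """Brute-force minimizer of the objective over beta in (0, 2)."""
    best_b, best_v = None, float("inf")
    for k in range(1, n):
        b = 2.0 * k / n
        v = bound_objective(b, s)
        if v < best_v:
            best_b, best_v = b, v
    return best_b

for s in (0.1, 5.0):
    print(s, c_of(s) / 8.0, grid_minimizer(s))
```

Under this assumption the grid minimizer agrees with $c/8$ up to the grid spacing, and $c/8$ stays below $2/3$ for every $s > 0$.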
Fig. 1. Per-sensor Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $L = 500$, $\rho = 1$ 

Fig. 2. Total Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $L = 500$, $P_T = 10$ 

Fig. 3. Per-sensor Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $\rho = 1$ 

Fig. 4. Total Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $P_T = 10$ 

Fig. 5. Panels: Total Power Constraint; Per Sensor Power Constraint. $\sigma_\eta^2 = 3$, $\sigma_v^2 = 1$, $L = 500$, $P_T = 10$ 

Fig. 6. Per-sensor Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $\rho = 1$, $\theta_R = 4$, $\theta = 2$ 

Fig. 7. Total Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $P_T = 10$, $\theta_R = 12$, $\theta = 2$ 

Fig. 8. Total Power Constraint, $E[|h|^2] = 1$, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $P_T = 10$, $\theta_R = 12$, $\theta = 2$ 

Fig. 9. Total Power Constraint, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $P_T = 10$, $\theta_R = 12$ 

Fig. 10. Total Power Constraint with Cauchy distributed sensing noise, $\gamma = 1$, $\sigma_v^2 = 1$, $P_T = 10$, $\theta_R = 12$, $\theta = 2$ 